
feat: OTEL metrics pipeline for task workers #3061

Draft
ericallam wants to merge 18 commits into main from feat/otel-metrics

Conversation

ericallam (Member) commented on Feb 14, 2026

  • Adds an end-to-end OTEL metrics pipeline: task workers collect and export metrics via OpenTelemetry, the webapp ingests them into ClickHouse, and they're queryable through the existing dashboard query engine
  • Workers emit process CPU/memory metrics (via @opentelemetry/host-metrics) and Node.js runtime metrics (event loop utilization, event loop delay, heap usage)
  • Users can create custom metrics in their tasks via otel.metrics.getMeter() from @trigger.dev/sdk
  • Metrics are automatically tagged with run context (run ID, task slug, machine, worker version) so they can be sliced per-run, per-task, or per-machine
  • The TSQL query engine gains metrics table support with typed attribute columns, prettyFormat() for human-readable values, and per-schema time bucket thresholds
  • Includes reference tasks (references/hello-world/src/trigger/metrics.ts) demonstrating CPU-intensive, memory-ramp, bursty workload, and custom metrics patterns
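The custom-metrics surface mentioned above (otel.metrics.getMeter() from @trigger.dev/sdk) can be sketched with a self-contained toy stand-in for the OTel Meter API. The Counter, Histogram, and Meter classes below are illustrative stand-ins so the sketch runs anywhere; in a real task you would call otel.metrics.getMeter() and receive the actual OpenTelemetry Meter:

```typescript
// Toy stand-in for an OTel Meter so this sketch is self-contained.
// In a real task: const meter = otel.metrics.getMeter("payments");
type Attributes = Record<string, string | number | boolean>;

class Counter {
  total = 0;
  add(value: number, _attributes?: Attributes) {
    this.total += value;
  }
}

class Histogram {
  values: number[] = [];
  record(value: number, _attributes?: Attributes) {
    this.values.push(value);
  }
}

class Meter {
  createCounter(_name: string) {
    return new Counter();
  }
  createHistogram(_name: string) {
    return new Histogram();
  }
}

// Usage inside a task's run function (sketch):
const meter = new Meter(); // stand-in for otel.metrics.getMeter("payments")
const processed = meter.createCounter("payments.processed");
const latency = meter.createHistogram("payments.latency_ms");

processed.add(1, { status: "ok" }); // keep attributes low-cardinality
latency.record(123, { status: "ok" });
```

Keeping attributes low-cardinality (statuses, booleans, bounded codes) matters because each distinct attribute combination produces a separate time series.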

What changed

Metrics collection (packages/core, packages/cli-v3)

  • Metrics export pipeline — TracingSDK now sets up a MeterProvider with a PeriodicExportingMetricReader that chains through TaskContextMetricExporter (adds run context attributes) and BufferingMetricExporter (batches exports to reduce overhead)
  • Host metrics — Enabled @opentelemetry/host-metrics for process CPU, memory, and system-level metrics
  • Node.js runtime metrics — New nodejsRuntimeMetrics.ts module using performance.eventLoopUtilization(), monitorEventLoopDelay(), and process.memoryUsage() to emit 6 observable gauges
  • Custom metrics — Exposed otel.metrics from @trigger.dev/sdk so users can create counters, histograms, and gauges in their tasks
  • Machine ID — Stable per-worker machine identifier for grouping metrics
  • Dev worker — Drops system.* metrics to reduce noise, keeps sending metrics between runs in warm workers
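The runtime gauges above can be approximated with Node's built-in node:perf_hooks alone. This is an illustrative sketch; the metric names are assumptions rather than the exact ones emitted by nodejsRuntimeMetrics.ts:

```typescript
import { monitorEventLoopDelay, performance } from "node:perf_hooks";

// Sample event loop delay continuously at ~20ms resolution.
const delayHistogram = monitorEventLoopDelay({ resolution: 20 });
delayHistogram.enable();

// One observation cycle, as a PeriodicExportingMetricReader might trigger it.
function sampleRuntimeMetrics(): Record<string, number> {
  const elu = performance.eventLoopUtilization(); // cumulative since process start
  const mem = process.memoryUsage();
  return {
    "nodejs.eventloop.utilization": elu.utilization, // fraction, 0..1
    "nodejs.eventloop.delay.p99_ms": delayHistogram.percentile(99) / 1e6, // ns -> ms
    "nodejs.memory.heap_used_bytes": mem.heapUsed,
  };
}
```

In the real pipeline these values are reported through observable gauges on a MeterProvider rather than returned from a function.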

Metrics ingestion (apps/webapp)

  • OTEL endpoint — otel.v1.metrics.ts accepts OTEL metric export requests (JSON and protobuf) and converts them to ClickHouse rows
  • ClickHouse schema — 016_create_metrics_v1.sql with 10-second aggregation buckets, a JSON attributes column, a 30-day TTL, and materialized views for 1m/5m rollups
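The conversion step can be illustrated with a sketch. The OtlpNumberDataPoint and MetricRow shapes below are simplified assumptions, not the actual types from the PR:

```typescript
// Simplified, assumed shapes — the real OTLP JSON and ClickHouse row types
// are defined in the webapp's ingest route and @internal/clickhouse.
type OtlpNumberDataPoint = {
  timeUnixNano: string; // nanoseconds as a decimal string
  asDouble?: number;
  asInt?: string;
  attributes: { key: string; value: { stringValue?: string } }[];
};

type MetricRow = {
  metric_name: string;
  bucket_start: number; // unix seconds, aligned to a 10-second bucket
  value: number;
  attributes: Record<string, string>;
};

function toMetricRow(name: string, dp: OtlpNumberDataPoint): MetricRow {
  const unixSeconds = Number(BigInt(dp.timeUnixNano) / 1_000_000_000n);
  const bucket_start = unixSeconds - (unixSeconds % 10); // 10s aggregation bucket
  const attributes: Record<string, string> = {};
  for (const kv of dp.attributes) {
    if (kv.value.stringValue !== undefined) attributes[kv.key] = kv.value.stringValue;
  }
  const value = dp.asDouble ?? Number(dp.asInt ?? 0);
  return { metric_name: name, bucket_start, value, attributes };
}
```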

Query engine (internal-packages/tsql, apps/webapp)

  • Metrics query schema — Typed columns for metric attributes (task_identifier, run_id, machine_name, worker_version, etc.) extracted from the JSON attributes column
  • prettyFormat() — TSQL function that annotates columns with format hints (bytes, percent, durationSeconds) for frontend rendering without changing the underlying data
  • Per-schema time buckets — Different tables can define their own time bucket thresholds (metrics uses tighter intervals than runs)
  • AI query integration — The AI query service knows about the metrics table and can generate metric queries
  • Chart improvements — Better formatting for byte values, percentages, and durations in charts and tables
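The kind of rendering that these format hints enable can be sketched like this; function names and exact output strings are illustrative, not the webapp's actual formatters:

```typescript
// Illustrative formatters for the bytes / percent / durationSeconds hints.
function formatBytes(n: number): string {
  const units = ["B", "KiB", "MiB", "GiB", "TiB"];
  let value = n;
  let i = 0;
  while (value >= 1024 && i < units.length - 1) {
    value /= 1024;
    i++;
  }
  const rendered = Number.isInteger(value) ? String(value) : value.toFixed(1);
  return `${rendered} ${units[i]}`;
}

function formatPercent(fraction: number): string {
  return `${(fraction * 100).toFixed(1)}%`; // input is a 0..1 fraction
}

function formatDurationSeconds(seconds: number): string {
  if (seconds < 60) return `${seconds}s`;
  const minutes = Math.floor(seconds / 60);
  const rest = seconds % 60;
  return rest === 0 ? `${minutes}m` : `${minutes}m ${rest}s`;
}
```

Because prettyFormat() only annotates the column, the underlying numeric data stays queryable; formatting like the above happens at render time.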

Reference project

  • references/hello-world/src/trigger/metrics.ts — 6 example tasks: cpu-intensive, memory-ramp, bursty-workload, sustained-workload, concurrent-load, custom-metrics

Test plan

  • Build all packages and webapp
  • Start dev worker with hello-world reference project
  • Run cpu-intensive, memory-ramp, and custom-metrics tasks
  • Verify metrics in ClickHouse: SELECT DISTINCT metric_name FROM metrics_v1
  • Query via dashboard AI: "show me CPU utilization over time"
  • Verify prettyFormat renders correctly in chart tooltips and table cells
  • Confirm dev worker drops system.* metrics but keeps process.* and nodejs.*

@changeset-bot

changeset-bot bot commented Feb 14, 2026

🦋 Changeset detected

Latest commit: a4497ed

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 28 packages
Name Type
trigger.dev Patch
@trigger.dev/sdk Patch
d3-chat Patch
references-d3-openai-agents Patch
references-nextjs-realtime Patch
references-realtime-hooks-test Patch
references-realtime-streams Patch
references-telemetry Patch
@trigger.dev/python Patch
@internal/sdk-compat-tests Patch
@trigger.dev/build Patch
@trigger.dev/core Patch
@trigger.dev/react-hooks Patch
@trigger.dev/redis-worker Patch
@trigger.dev/rsc Patch
@trigger.dev/schema-to-json Patch
@trigger.dev/database Patch
@trigger.dev/otlp-importer Patch
@internal/cache Patch
@internal/clickhouse Patch
@internal/redis Patch
@internal/replication Patch
@internal/run-engine Patch
@internal/schedule-engine Patch
@internal/testcontainers Patch
@internal/tracing Patch
@internal/tsql Patch
@internal/zod-worker Patch



coderabbitai bot commented Feb 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.


Walkthrough

Adds per-column display format hints (ColumnFormatType) through tsql schema and printer and introduces format-aware formatters and utilities (bytes, decimalBytes, percent, duration, cost, quantity) used in charts, legends, tooltips, tables, and big-number cards. Adds per-table timeBucketThresholds and conditional FINAL handling for time-bucketing. Implements ClickHouse metrics support (table migration, insert helper, types), OTLP metrics ingest route, an OTLP->ClickHouse exporter, and extensive OTEL metrics plumbing across SDK, CLI, workers, and environment configuration (env vars, metric exporters/readers, metric buffering/processing, machine id propagation).

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 3 passed | ❌ 1 failed
❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 45.45%, below the required 80.00% threshold. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (3)
  • Title check — ✅ Passed: the title accurately describes the main addition, an OTEL metrics pipeline for task workers, which aligns with the primary objective of the PR.
  • Description check — ✅ Passed: the description is comprehensive and well-structured, covering all major changes across multiple subsystems, though the template's checklist and testing steps are only partially complete.
  • Merge Conflict Detection — ✅ Passed: no merge conflicts detected when merging into main.




coderabbitai bot (Contributor) left a comment

🧹 Nitpick comments (5)
internal-packages/tsql/src/query/printer.ts (3)

743-749: The format property is smuggled via a type assertion — consider a typed return.

analyzeSelectColumn attaches format onto sourceColumn (line 984), but ColumnSchema doesn't declare that field. Here you cast to Partial<ColumnSchema> & { format?: ColumnFormatType } to read it back. This coupling is fragile — a refactor of the return type could silently break the flow.

Consider adding format?: ColumnFormatType to analyzeSelectColumn's return type directly (as a sibling of sourceColumn) rather than embedding it inside the column object:

♻️ Suggested approach
 private analyzeSelectColumn(col: Expression): {
   outputName: string | null;
   sourceColumn: Partial<ColumnSchema> | null;
   inferredType: ClickHouseType | null;
+  format?: ColumnFormatType;
 } {

Then in the prettyFormat branch, return format as a top-level field instead of merging it into sourceColumn. The consumer in visitSelectColumnWithMetadata can then read format directly without a cast.


963-977: validFormats list can drift from ColumnFormatType — consider deriving one from the other.

The hardcoded validFormats array must be kept in sync with ColumnFormatType manually. If a new format is added to the union type, forgetting to update this list would silently reject it at runtime.

♻️ Suggested approach
+const PRETTY_FORMAT_TYPES = [
+  "bytes",
+  "decimalBytes",
+  "quantity",
+  "percent",
+  "duration",
+  "durationSeconds",
+  "costInDollars",
+  "cost",
+] as const satisfies readonly ColumnFormatType[];

 // In analyzeSelectColumn:
-const validFormats = [
-  "bytes",
-  "decimalBytes",
-  "quantity",
-  "percent",
-  "duration",
-  "durationSeconds",
-  "costInDollars",
-  "cost",
-];
-if (!validFormats.includes(formatType)) {
+if (!(PRETTY_FORMAT_TYPES as readonly string[]).includes(formatType)) {

Using satisfies readonly ColumnFormatType[] ensures the compiler flags any value not in the union, catching drift at build time.


2868-2874: Format type validation is skipped when prettyFormat() is used outside SELECT.

In visitCall, the wrapper is stripped without validating the format type string. Validation only happens in analyzeSelectColumn (which only runs for SELECT columns). If a user writes WHERE prettyFormat(x, 'bogus') > 5, it silently passes — no error, but also no harm since the wrapper is stripped. This is likely acceptable for a display-only hint, but worth noting for consistency.

apps/webapp/app/v3/querySchemas.ts (2)

575-584: Inconsistent column types for environment_type and machine_name vs. runsSchema.

In runsSchema, environment_type (line 74) and machine (line 395) use LowCardinality(String). Here, both environment_type and machine_name (line 568) use plain String despite having the same bounded allowedValues. While these are expression columns extracted from JSON, using consistent TSQL types improves schema coherence and could help the query engine optimize accordingly.

Suggested fix
     machine_name: {
       name: "machine_name",
-      ...column("String", {
+      ...column("LowCardinality(String)", {
         description: "The machine preset used for execution",
         allowedValues: [...MACHINE_PRESETS],
         example: "small-1x",
       }),
       expression: "attributes.trigger.machine_name",
     },
     environment_type: {
       name: "environment_type",
-      ...column("String", {
+      ...column("LowCardinality(String)", {
         description: "Environment type",
         allowedValues: [...ENVIRONMENT_TYPES],
         customRenderType: "environmentType",
         example: "PRODUCTION",
       }),
       expression: "attributes.trigger.environment_type",
     },

485-492: Rename metric_subject to be more specific or document its constraints.

metric_subject is consistently populated only with machineId values (lines 502, 527, 554 in otlpExporter.server.ts), yet the ClickHouse column name is generically scoped as metric_subject while the TSQL alias is specifically named machine_id. This naming mismatch could confuse developers querying ClickHouse directly or extending the schema later. Either align the ClickHouse column name to machine_id to match the TSQL side, or add a schema comment clarifying that metric_subject is constrained to machine identifiers.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between f3c9dfb and ed7e891.

📒 Files selected for processing (3)
  • apps/webapp/app/v3/querySchemas.ts
  • internal-packages/tsql/src/query/printer.ts
  • internal-packages/tsql/src/query/schema.ts
🔇 Additional comments (8)
internal-packages/tsql/src/query/schema.ts (2)

273-298: LGTM — ColumnFormatType is well-structured as a string union.

Clean use of a string union type (no enum), good separation of existing render types vs. new format hints, and well-documented.


395-407: LGTM — timeBucketThresholds and useFinal are clean, well-documented additions.

Per-table time bucket overrides and opt-in FINAL for ReplacingMergeTree are sensible schema extensions.

internal-packages/tsql/src/query/printer.ts (3)

1618-1633: LGTM — conditional FINAL based on useFinal is correct.

Good change to make FINAL opt-in per table rather than always applied. The redundant lookupTable call (already called earlier in the same method) is harmless since it's just a map lookup.


3052-3057: LGTM — per-table time bucket thresholds cleanly wired through.


62-62: LGTM — type-only import of ColumnFormatType.

apps/webapp/app/v3/querySchemas.ts (3)

36-36: LGTM — useFinal: true is appropriate for a ReplacingMergeTree table.

This ensures deduplication is applied when querying task_runs_v2.


602-613: Time bucket thresholds are well-structured and appropriate for 10s pre-aggregated data.

The progression is sensible, and the satisfies keyword ensures type safety without losing the literal type. Minor note: with a 30-day TTL on metrics_v1, the thresholds beyond 30 days (90d, 180d, 365d) won't be exercised currently, but they're harmless and provide forward-compatibility.
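The threshold lookup behind these per-schema buckets can be sketched as follows; the field names and interval values are assumptions, not the actual BucketThreshold shape in internal-packages/tsql:

```typescript
// Assumed shape and values for illustration only.
type BucketThreshold = { maxRangeSeconds: number; bucketSeconds: number };

const metricsThresholds: BucketThreshold[] = [
  { maxRangeSeconds: 3_600, bucketSeconds: 10 },       // <= 1h: raw 10s buckets
  { maxRangeSeconds: 86_400, bucketSeconds: 60 },      // <= 1d: 1m rollup
  { maxRangeSeconds: 7 * 86_400, bucketSeconds: 300 }, // <= 7d: 5m rollup
];

// Pick the first bucket whose threshold covers the queried range,
// falling back to the coarsest bucket for very wide ranges.
function pickBucket(rangeSeconds: number, thresholds: BucketThreshold[]): number {
  for (const t of thresholds) {
    if (rangeSeconds <= t.maxRangeSeconds) return t.bucketSeconds;
  }
  return thresholds[thresholds.length - 1]!.bucketSeconds;
}
```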


619-619: LGTM — metricsSchema correctly added to the exported querySchemas array.

devin-ai-integration bot left a comment

Devin Review found 2 potential issues; 8 additional findings are available in Devin Review.

Comment on lines +968 to +976
{
key: "TRIGGER_OTEL_METRICS_EXPORT_INTERVAL_MILLIS",
value: env.DEV_OTEL_METRICS_EXPORT_INTERVAL_MILLIS,
},
{
key: "TRIGGER_OTEL_METRICS_EXPORT_TIMEOUT_MILLIS",
value: env.DEV_OTEL_METRICS_EXPORT_INTERVAL_MILLIS,
}
);


🟡 TRIGGER_OTEL_METRICS_EXPORT_TIMEOUT_MILLIS reuses interval value instead of timeout value

Both dev and prod metric environment variable configuration set TRIGGER_OTEL_METRICS_EXPORT_TIMEOUT_MILLIS to the same value as TRIGGER_OTEL_METRICS_EXPORT_INTERVAL_MILLIS. The export timeout and export interval serve different purposes — the timeout controls how long to wait for a single export operation, while the interval controls how often exports occur.

Affected locations and impact

Dev at apps/webapp/app/v3/environmentVariables/environmentVariablesRepository.server.ts:973-974:

{
  key: "TRIGGER_OTEL_METRICS_EXPORT_TIMEOUT_MILLIS",
  value: env.DEV_OTEL_METRICS_EXPORT_INTERVAL_MILLIS, // should be its own timeout value
}

Prod at apps/webapp/app/v3/environmentVariables/environmentVariablesRepository.server.ts:1123-1124:

{
  key: "TRIGGER_OTEL_METRICS_EXPORT_TIMEOUT_MILLIS",
  value: env.PROD_OTEL_METRICS_EXPORT_INTERVAL_MILLIS, // should be its own timeout value
}

The downstream consumer in tracingSDK.ts further clamps: Math.min(exportTimeoutMillis, collectionIntervalMs), which mitigates the impact in most configurations. However, if someone sets a short export interval (e.g., 2s), the timeout would also be 2s, which may be insufficient for slow network conditions.

Impact: Incorrect metric export timeout configuration; mitigated by downstream clamping in most cases.
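The clamp described above reduces to a two-line sketch (values illustrative): with a 2-second interval, the effective timeout collapses to 2 seconds as well.

```typescript
// Illustrative values; the real ones come from the webapp's env configuration.
const exportIntervalMillis = 2_000;                   // how often exports fire
const configuredTimeoutMillis = exportIntervalMillis; // the bug: timeout reuses the interval value
// Downstream clamp (as described in the finding for tracingSDK.ts):
const effectiveTimeoutMillis = Math.min(configuredTimeoutMillis, exportIntervalMillis);
```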
